home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Mac100% 1998 November
/
MAC100-1998-11.ISO.7z
/
MAC100-1998-11.ISO
/
オンラインソフト定点観測
/
ユーティリティ
/
Mops 3.2.sea
/
Mops 3.2
/
Mops source
/
PPC source
/
cg5
< prev
next >
Wrap
Text File
|
1998-06-22
|
34KB
|
1,231 lines
marker m__cg5
PPC?
[IF]
false constant debug?
[ELSE]
false constant debug?
[THEN]
0 value distance_code_moved
0 value ^fwd
0 value last_colon_defn
0 value fp_flags?
forward PPC_OBJ
PPC? not
[IF]
forward CALL_H ¥ in PPC mode, these are already forward defined in
forward LIT_ADDR ¥ qpClass, which gets loaded before the cgx files.
[THEN]
¥ on the PPC, normal alignment is 4-byte:
: ALIGN align4 ;
PPC?
[IF]
: CODE_ALIGN CDP #align4 -> CDP ;
[ELSE]
¥ CODE_ALIGN is in ppcOn68k
[THEN]
PPC?
[IF]
¥ we need LITERAL early in this file, since in a number of the following
¥ words we use postpone literal in which we really do want to compile
¥ a call to this PPC version of literal.
: LITERAL ¥ ( n -- ) Compiles a fetch of n as a literal.
¥ We just push onto cstk, hoping we can combine with an
¥ op at run time. If dpl is positive, n is a double number.
clear: opnd1
dpl 0>=
IF swap >lit: opnd1
opnd1 push
THEN
>lit: opnd1
opnd1 push ; immediate
[THEN]
¥ ========= FINALIZATION OF DEFINITIONS ===========
(* At semicolon time, there are a number of things we have to fix up in
the definition we just compiled. Once we've compiled the prolog and
epilog and added the const_data, if any, the final location of the
code is known. We can then "finalize" the definition. This includes
resolving any EXITs and LEAVEs and putting in the correct offsets for
calls to other words (these couldn't be determined before the code's
final location was determined).
To handle these various things, we use some pseudo-instructions to stand in
place of the final instructions we're going to put in those same locations
at finalization time. To finalize, we look through the whole definition
for these pseudo-instructions, and take the appropriate action.
For the pseudo-instructions, we have to use opcodes that can't ever be
used for real instructions. So we use the lmw and stmw instructions, since
these instructions won't always be available in hardware on PPC processors,
so we never ever want to generate them.
This gives us pseudo-instructions with a top byte in the range B8 - BF.
We also define our handler codes into this same range, since the two-byte
handler code appears on an aligned boundary. This will prevent a
handler code ever being mistaken for an instruction. Thus all our
handler codes and pseudo-instructions identify themselves.
So far we've defined these:
BAxx xxxx call to a Mops word (xx xxxx is code-relative offset)
BBxx xxxx floating-point flag bytes. This code is basically for
the disassembler, since it's aligned and 3 flag bytes
are plenty. It shouldn't come up in finalization.
BCxx all handler codes which have no boilerplate code (can't
be EXECUTEd). Unlike on the 68k, xx is unsigned and not
doubled (so we can have 256 codes if we need them).
BDxx all other handler codes, except for colon defns. They can
be EXECUTEd.
BE00 handler code for a colon defn
BE01 ditto, but means this is a forward defn.
BE02 marks the start of :loc code
BE03 marks the start of :mloc code
BE04 handler code for a :ppc_proc defn
BE40 method (note BD40 is an inline method - so the 40 always
marks a method)
(we'll reserve BExx for any further options on colon definitions.)
These next two can't ever appear inside a definition, so we give an error
if they're encountered during finalization.
BF01 handler code for SYSCALL and EXTERN
BF0B handler code for LIBRARY
BF02 0000 EXIT
BF03 xxxx conditional EXIT (xxxx is cond. branch opcode)
BF04 0000 LEAVE
BF05 0000 LOOP
BF06 0000 target of a forward defn. This marker is redundant,
but makes the decompiler output look more sensible.
BF08 xxxx unconditional branch. xxxx is offset (we only need 16 bits).
BF09 xxxx ELSE - branch. xxxx is initially the offset back to the
original conditional branch, in case we delete this branch
and need to adjust. Once the branch is resolved, and we
know it won't be deleted, xxxx becomes the branch offset
as for other unconditional branches. We can tell which
is which, since the first offset is negative and the
second positive.
BF0A replace with a literal load into r0 of the distance the code
is moved. Used in generating the addr of a location
within the current definition (since we don't know until
the end how far it might be moved).
*)
0 value CURR-DEF-CODE ¥ when we finish compiling a defn we move
¥ the code to make room for the prolog -
¥ this holds the addr of the code proper,
¥ which we need when we're finalizing.
0 value EXIT_LOC
ppc? not
[if]
0 value CONST_DATA_START ¥ PPC version is in pnuc1
[then]
0 value 1st_defn
0 value init_entry
forward add_const_data
forward set_CD_gpr#
forward FP_adjust
:f set_CD_gpr# ;f ¥ the real defn is in zArgs
: FIX_UNCOND_BRANCH { pos ¥ thisPos offs -- }
¥ due to back-equalization, we can get unconditional branches to
¥ other unconditional branches. We can simplify these here.
pos 2+ w@x -> offs
pos -> thisPos
BEGIN
offs NIF ¥ we must have an empty loop! Force this loop to end...
false
ELSE
offs ++> thisPos
thisPos w@ $ FFFE and $ BF08 =
THEN
WHILE
thisPos 2+ w@x -> offs
REPEAT
thisPos pos - $ 3FFFFFF and $ 48000000 or ¥ b instruction
pos !
;
: FIX_EXIT { pos -- }
$ 48000000 ¥ b instruction
exit_loc pos - $ 00FFFFFF and
or pos !
;
: FIX_CONDITIONAL_EXIT { pos -- }
pos @ 16 << ¥ cond branch PPC opcode
exit_loc pos - $ 0000FFFF and
or pos !
;
: FIX_LEAVE { pos ¥ target -- }
pos -> target
BEGIN ¥ scan forward to find the next LOOP or +LOOP
4 ++> target
target w@ $ BF05 =
UNTIL
$ 48000000 ¥ uncond branch opcode
target pos - $ 00FFFFFF and
or pos !
;
: FIX_LOOP { pos -- } ¥ replaces the LOOP pseudo-op with the first instruction
¥ of the LOOP windup sequence (see comments there)
$ 82B10000 pos ! ¥ lwz r21/I, (r17/RP) restore I
;
: FIX_CALL { pos -- }
pos @ $ 00FFFFFF and ¥ code area offset to called location
code_start + pos - ¥ make relative to locn of call
$ 03FFFFFF and $ 48000001 or ¥ construct bl instrn
pos !
;
: COMPILE_DISTANCE_MOVED { pos -- }
$ 38000000 distance_code_moved or pos ! ¥ addi r0, 0, dddd
;
: FIX_VARIANT_OP { pos ¥ vop -- } ¥ handles BFxx opcodes
pos 1+ c@ -> vop
vop
SELECT[ $ 02 ]=> pos fix_exit
[ $ 03 ]=> pos fix_conditional_exit
[ $ 04 ]=> pos fix_leave
[ $ 05 ]=> pos fix_loop
[ $ 06 ]=> ¥ ignore this one
[ $ 08 ],
[ $ 09 ]=> pos fix_uncond_branch
[ $ 0A ]=> pos compile_distance_moved
DEFAULT=> cr .h ." illegal variant opcode during finalization"
]SELECT
;
: FIX_1_OP { pos op -- }
op
SELECT[ $ BA ]=> pos fix_call
[ $ BF ]=> pos fix_variant_op
DEFAULT=> cr .h ." illegal opcode during finalization"
]SELECT
;
false value inhibit_finalization?
: FINALIZE_DEFN { ¥ pos op -- }
inhibit_finalization? ?EXIT
curr-def-code -> pos
BEGIN
pos CDP >= ?EXIT
pos c@ -> op
op $ F8 and $ B8 =
IF pos op fix_1_op
THEN
4 ++> pos
AGAIN
;
¥ ========= COMPILATION OF CALLS, EXIT etc. ===========
false value LEAF? ¥ normally set true at colon time, then set false
¥ when we do a call. Thus at semicolon time,
¥ if it's still true, we know this was a leaf
¥ routine.
¥ false value 2LEV? ¥ we may use this similarly, to detect defns
¥ which aren't leaves, but only call leaves.
¥ But we're not using it yet.
false value CTR_CLOBBERED? ¥ we use this in deciding if we can use
¥ a branch on the ctr in DO..LOOP and
¥ FOR..NEXT.
0 value FP_FLAGS ¥ we keep these here till semicolon time, when
¥ we move the definition and put them in the
¥ right place.
: PUSH_LR
LR>R0 code,
0 select: GPRs RP_reg true compPush: GPRs
;
: PULL_LR
0 select: GPRs
RP_reg 0 4 compPull: GPRs
R0>LR code,
;
: LR>TREG ¥ can be used in a defn if we know what we're doing
LR>R0 code, ; immediate
: TREG>LR
R0>LR code, ; immediate
: PLentry { ¥ reg# addr -- } ¥ handle entry with named parms/locals
(* All we do here is set the flag bytes for this definition, and the initial
cstk. Everything else is done by compile_prolog, and we don't call that
until the end, when we know if this is a leaf proc or not.
We don't set the flag bytes if this is the resolution of a forward
definition, since the header's been fixed up already (and isn't here anyway).
(This is handled in qpCond.)
The flag bytes are described in the comments near the start of cg1.
*)
forward?
NIF
curr-def 2 - -> addr
#P 15 and 4 <<
#PL 15 and or
addr 1+ c!
#FP 15 and 4 <<
#FPL 15 and or
or> fp_flags
THEN
¥ now the initial cstk is different to before, if there were parms
#P gpr_call_cnt max #P - setup_cstk
#FP fpr_call_cnt max #FP - setup_fcstk
;
: COMPILE_PROLOG&EPILOG { #gprs_to_save #fprs_to_save ¥ svCDP src dst offs len -- }
leaf? ¥ leaf procs don't save/restore anything - it's all done in the caller.
¥ So there's no prolog or epilog. Here we just have to initialize
¥ a couple of values which the caller needs, then get out.
IF
curr-def -> curr-def-code
CDP -> exit_loc
EXIT
THEN
¥ Now we work out the prolog size, which will be the distance we need to
¥ move the code up memory to make room for the prolog. See the comments
¥ in cg3 for details on the prolog, which will make sense of these
¥ machinations.
local?
IF 8
ELSE #gprs_to_save #P + 4*
#fprs_to_save #FP + 4* +
8 +
[ ppc? ] [if]
CD_gpr# IF 12 + THEN
[then]
method? IF 8 + THEN ¥ trying something here
THEN
¥ method? IF 8 + THEN
tempObj_framesize IF 4+ THEN
-> offs ¥ offs will be the prolog size
forward?
IF #P gpr_call_cnt > ¥ for forward defns, if the no of named parms
IF 4 ++> offs ¥ is greater than call_cnt, some parms will
¥ have to be pulled from mem, so we'll need
¥ an extra 4 bytes for the SP adjustment.
THEN
#FP fpr_call_cnt >
IF 4 ++> offs
THEN
THEN
curr-def -> src
src offs + -> dst
offs
IF
CDP src - -> len
src dst len move
src offs erase ¥ debugging only
offs ++> CDP
offs ++> distance_code_moved
THEN
CDP -> svCDP src -> CDP
init_GPRs init_FPRs ¥ all regs invalid at the start!
local?
IF 0 0 0 0
ELSE
#gprs_to_save #P #fprs_to_save #FP
THEN
false method? compile_prolog
svCDP -> CDP
CDP -> exit_loc ¥ EXITs resolve to here
false method? compile_epilog
¥ now our default entry point for the whole program is straight after the
¥ prolog of the last definition compiled (since on initial entry the
¥ RP isn't set up, so we mustn't execute a prolog!)
init_entry IF offs ++> init_entry THEN
dst -> curr-def-code
;
(* Colon uses a new header format incorporating two flag bytes in addition to
what we use on the 68k, and also has to observe 4-byte alignment for the
code. See comments at the start of cg1 for details.
*)
: PPC_ENTRY { fwd? -- 300 }
?exec
fwd? -> forward? ¥ transfer to our global so we'll
¥ take special action if we get { ... }
CDP -> backstop_CDP
CDP -> fetch_backstop
0 -> basic_block_start ¥ the idea is that regs passed in will
¥ have 0 in their opCDP fields, and I
¥ don't want to block cascades
¥ unnecessarily. Calling the basic block
¥ start zero is apparently harmless.
0 -> max_called_#PL
0 -> max_called_#FPL
0 -> stk_offset 0 -> distance_code_moved
¥ [ ppc? ] [if] 0 -> CD_gpr# [then]
false -> will_skip?
false -> ctr_clobbered?
0 -> fp_flags
clear: eq_ranges clear: const_data
gpr_call_cnt setup_cstk ¥ will be redone if we get { ... }
fpr_call_cnt setup_fcstk
fwd?
IF false
ELSE optimize_leaf_calls?
THEN -> leaf?
¥ only set leaf? flag if optimizing leaf calls, and this isn't
¥ a forward definition
¥ 0 >size: control_stk 0 >size: control_flags
-1 -> gpr_rtn_cnt ¥ means we haven't set its specific value yet
-1 -> fpr_rtn_cnt
¥ release: const_data new: const_data
0 -> stk_offset
[ PPC? ]
[IF]
-1 -> state ¥ same as postpone ] - the only thing we really
¥ need from the 68k (:)
[ELSE]
(:)
[THEN]
CDP -> curr-def ¥ the entry prolog gets added later
300 ¥ security marker
;
PPC?
[IF]
: :
CDP -> last_colon_defn ¥ used by compile_call in checking where
¥ a call is coming from
CDP -> const_data_start
local? NIF CDP -> CD_gpr_loc THEN
1st_defn NIF CDP -> 1st_defn THEN
ppc_header
$ BE000000 code, ¥ handler code for PPC colon defns,
¥ and initial flag bytes
false -> method?
false -> noname?
0 >size: control_stk 0 >size: control_flags
false ppc_entry ¥ handle ppc proc entry
postpone hide ¥ new word is hidden until defn end
; ppc_immediate
: :NONAME ( -- xt 300 )
CDP -> const_data_start
$ BE000000 code, ¥ no hdr, just handler code for PPC colon defns,
¥ and initial flag bytes
CDP 2- ¥ xt = addr of flag bytes
false -> method?
true -> noname?
false ppc_entry ¥ handle ppc proc entry
;
[ELSE]
: :
PPC?
IF CDP -> const_data_start
1st_defn NIF CDP -> 1st_defn THEN
ppc_header
$ BE000000 code, ¥ handler code for PPC colon defns,
¥ and initial flag bytes
false -> method?
false -> noname?
0 >size: control_stk 0 >size: control_flags
false ppc_entry ¥ handle ppc proc entry
postpone hide ¥ new word is hidden until defn end
ELSE
postpone :
THEN
; immediate
[THEN]
: CompExit
¥ store_all_pending
¥ tail_optimize? ?EXIT
BLR code,
;
: CLRCOMP
0 -> #PL 0 -> #FPL
false -> method? false -> noname?
false -> mloc?
0 -> tempObj_frameSize
false -> fltFlg
;
PPC? [IF]
forward releaseTemps ¥ in zClass
[THEN]
: (;) { ^flags ¥ loc_addr #gprs_to_save #fprs_to_save -- }
¥ factors out common code for ; and ;m
debug? if
." (;) here" cr
then
false -> fp_flags?
(* First, we call set_constData_reg which decides if we're going to
use one of our locals regs as a base reg for addressing the const
data area. Then if we do this, we make this defn non-leaf. We also
do this if there are temp objects. We don't want the extra complexity
of const data reg or temp object management in the two types of
calling sequence - one is quite enough!
*)
¥ set_constData_reg ¥ sets CD_gpr# non-zero if we're using it
¥ CD_gpr# IF false -> leaf? THEN
tempObj_framesize
IF false -> leaf?
[ ppc? ] [if]
releaseTemps
[then]
THEN
¥ now we equalize the stacks according to what we want for a
¥ return from this definition:
get_rtn_cnts simple_equalize
¥ now we set up the first flag byte for this defn, unless it was forward.
¥ The second flag byte was set up at PLentry. We also set up the FP flag
¥ bytes if necessary.
forward?
NIF
fpr_call_cnt fpr_rtn_cnt <> #FPL 0<> or -> fp_flags?
leaf? $ 80 and
ctr_clobbered? $ 40 and or
fp_flags? $ 10 and or
gpr_rtn_cnt or
^flags c!
fp_flags?
IF fpr_rtn_cnt 8 << or> fp_flags THEN
THEN
#PL max_called_#PL max -> #GPRs_to_save
#FPL max_called_#FPL max -> #FPRs_to_save
#GPRs_to_save #FPRs_to_save compile_prolog&epilog
fp_adjust ¥ can move the defn
add_const_data ¥ ditto
compExit
finalize_defn
(* Now if init_entry has been set nonzero, it means this defn has asked to be
set as the initial entry, by putting the target addr in init_entry.
We handle the initial entry in a simple way - rather than trying to do
anything clever with the PEF, we just put a branch at the start of the
code. Here if necessary, we resolve that branch to init_entry. Note we
don't use resolve_branch, since that's for use within a defn, and assumes a
16-bit offset, which surely won't be enouyh here.
*)
init_entry
IF init_entry code_start - $ 03FFFFFF and ¥ offset - should be positive
code_start @ $ FC000000 and or
code_start !
0 -> init_entry
THEN
noname? forward? or NIF postpone reveal THEN
local? NIF clrComp 0 -> #P 0 -> #FP
[ ppc? ] [if] 0 -> CD_gpr# [then]
THEN
0 -> state
false -> noname? false -> mloc?
[ ppc? ] [if]
curr-def-code 32 - CDP over - 32 + fix_caches
[then]
;
PPC?
[IF]
: ; ppc_immediate
curr-def 2- (;)
300 ?defn
;
: ;proc
curr-def 2- (;)
306 ?defn
; ppc_immediate
: EXIT
get_rtn_cnts simple_equalize
$ BF020000 code, ¥ opcode BF02 = EXIT. Will be changed
¥ to an uncond branch to the epilog.
size: control_flags
IF pop: control_flags 4 or push: control_flags THEN
¥ this basic block is now dead!
CDP -> basic_block_start ¥ to block hoists - but not much
¥ point!
; ppc_immediate
: ?EXIT
" IF EXIT THEN" evaluate
; ppc_immediate
: 0EXIT
" NIF EXIT THEN" evaluate
; ppc_immediate
[ELSE]
: ;
PPC?
IF
curr-def 2- (;)
¥ 300 ?defn
300 <> IF cr ." warning - unbalanced!!" cr THEN
ELSE
postpone ;
THEN
; immediate
: EXIT
PPC?
IF get_rtn_cnts simple_equalize
$ BF020000 code, ¥ opcode BF02 = EXIT. Will be changed
¥ to an uncond branch to the epilog.
size: control_flags
IF pop: control_flags 4 or push: control_flags THEN
¥ this basic block is now dead!
CDP -> basic_block_start ¥ to block hoists - but not much
¥ point!
ELSE
postpone exit
THEN
; immediate
: ?EXIT
PPC?
IF " IF EXIT THEN" evaluate
ELSE
postpone ?exit
THEN ; immediate
: 0EXIT
PPC?
IF " NIF EXIT THEN" evaluate
ELSE
postpone 0exit
THEN ; immediate
[THEN]
0 value sv_curr-def
0 value sv_#PL
0 value sv_#P
0 value sv_const_data_start
0 value sv_fp_flags
0 value sv_CD_gpr#
(* (suspend_compilation) and (resume_compilation) are called by [ and ] respectively.
The problem we have to solve is that when resuming compilation we have to
restore the code generator state to the same as it was when compilation was
suspended - i.e. all the reg contents, cstk, etc etc. To simpilfy this a bit,
we don't try to save and restore the world, but treat suspension and resumption
like calling a word with 2 parms and 2 results. So at suspension time, we
do a simple_equalize with 2 cells, then at resumption we just initialize
the regs and cstk back to that state. This is a bit simpler than what we do
at call_h, since we're not really calling another word.
*)
: (suspend_compilation)
state 0EXIT
curr-def -> sv_curr-def
#PL -> sv_#PL #P -> sv_#P
CD_gpr# -> sv_CD_gpr#
const_data_start -> sv_const_data_start
reset: const_data const_data ->: sv_const_data
clear: const_data
fp_flags -> sv_fp_flags
0 -> fp_flags
0 -> #PL 0 -> #P
2 -1 simple_equalize
;
: (resume_compilation)
sv_curr-def -> curr-def
sv_#PL -> #PL sv_#P -> #P
sv_CD_gpr# -> CD_gpr#
sv_const_data_start -> const_data_start
sv_const_data ->: const_data clear: sv_const_data
sv_fp_flags -> fp_flags
2 setup_cstk update_refcnts
;
¥ =================== LITERAL ADDRESSES ===================
: (LITADDR) { gpr# offs -- }
gpr# >gpr: opnd1
offs NIF ¥ this can happen in object binding, so we'll save time
¥ and get rid of it here
opnd1 push EXIT
THEN
offs >lit: opnd2
otAdd -> operation
compRegLit
res1 push
;
(* LITADDR_H handles the generation of an address IN THE DATA AREA.
(on the 68k we didn't have to distinguish).
At the xt+2 there will be a reloc pointer to the right place in the
data area (xt=cfa).
(For addresses in the code area, use lit_addr)
*)
: LITADDR_H { xt -- } ¥ We handle this simply as an add of the base reg and
¥ the displacement.
xt 2+ @b&d (litaddr)
;
¥ LIT_ADDR just generates an address, regardless of where it is.
:f LIT_ADDR ( addr -- )
b&d (litaddr)
;f
¥ =========== CONSTANT DATA (stored in code area) =============
: CODE_ADDR_IN_CURR_DEF ( addr -- )
lit_addr
$ BF0A0000 code, ¥ at finalization time, is replaced with a load
¥ into r0 of the distance the code is moved
" treg +" evaluate
;
(* CONST_DATA_REF compiles a push of the addr of the current location
in the constant data, which we use for literal strings, floating
point literals, etc etc.
For example, if we have ( addr len ) referring to the data we
want to add, the normal way to add it would be this:
const_data_ref ¥ compiles push of addr at run time
( addr len ) add: const_data
The constant data for the current definition will be placed straight
before it in the dictionary. Thus it will be read-only in installed
apps. While we're compiling the defn we put the const data in the
bytestring const_data, then fix it up at semicolon time, via
ADD_CONST_DATA.
This mechanism replaces literal strings and w@(IP) stuff that used
to go in the middle of the code itself.
*)
: CONST_DATA_REF { ¥ gpr# displ -- }
0 -> svOpcode
const_data_start pos: const_data + ¥ the addr we want
b&d -> displ -> gpr#
[ ppc? ] [if]
displ true 16bits? nip
NIF
CD_gpr# NIF set_CD_gpr# THEN ¥ sets CD_gpr#
CD_gpr#
IF ¥ if successful, we use it
false -> leaf?
CD_gpr# -> gpr#
const_data_start CD_gpr_loc - pos: const_data +
-> displ
THEN
THEN
[then]
gpr# displ (litaddr)
;
ppc? not
[IF]
0 value prev_link ¥ saves prev link to current method - in
[THEN] ¥ ppc mode, it's in cg-class which gets
¥ loaded first
¥ FP_ADJUST and ADD_CONST_DATA may put some info at or before the start of
¥ the definition, and move it up in memory to make room. These can't be
¥ called until semicolon time, since all their info mightn't be there
¥ till then.
:f FP_ADJUST { ¥ dist -- }
fp_flags? 0EXIT
curr-def ¥ src
dup 4+ ¥ dst
CDP curr-def - ¥ len for move
move
¥ now we adjust some other things, since we just moved the definition:
4 ++> distance_code_moved
4 ++> curr-def 4 ++> curr-def-code
4 ++> CDP 4 ++> exit_loc
init_entry IF 4 ++> init_entry THEN
¥ now we move the FP flag bytes into the area we made for it
fp_flags $ BB000000 or curr-def 4- !
;f
:f ADD_CONST_DATA { ¥ dist -- }
reset: const_data
len: const_data 0EXIT
¥ first we move the whole defn down to make room for the const data.
¥ At this point const_data_start will point to the link field of
¥ the current definition, or our marker word if it's a forward
¥ or :noname definition.
len: const_data #align4 -> dist
forward? noname? or mloc? or
NIF
method?
IF
[ ppc? ] [if]
dist negate ^meth_link +! ¥ update method's link field
[else]
dist negate const_data_start 4+ +! ¥ update method's link field
[then]
dist prev_link +!
ELSE
dist
const_data_start 4+ ¥ start of name field - note, can't use
¥ "curr-def >name" as it's a PPC header!
thread +! ¥ incr CONTEXT entry by dist
dist negate const_data_start +! ¥ and update dic link for this defn
dist ++> latest
THEN
THEN
const_data_start ¥ src
dup dist + ¥ dst
CDP const_data_start - ¥ len for move
move
¥ now we adjust some other things, since we just moved the definition:
dist ++> distance_code_moved
dist ++> curr-def dist ++> curr-def-code
dist ++> CDP dist ++> exit_loc
init_entry IF dist ++> init_entry THEN
¥ now we copy the const data into the area we made for it
all: const_data
const_data_start swap move
;f
PPC? [IF]
: RELOC>CONST_DATA ¥ ( xt -- ) Just needed to support XTS{ in zBase,
¥ since at that point we can't send messages yet so we
¥ can't manipulate const_data directly.
0 +L: const_data
^1st: const_data 4-
reloc!
;
[THEN]
(* CALL_EXTERN handles an external call. This requires that we set things up as
the PowerPC volume of IM says:
1. We have a pointer which is resolved by the CFM - this will
be the address of a transition vector. This pointer will be in the
data area (since it gets changed), and has a reloc addr pointing
to it in the code area, which belongs to the SYSCALL or EXTERN
word.
We have to allow for new external calls to be asked for, then
executed straight away, so we use a scheme where when we do an
external call, we check whether the pointer has been resolved
yet, and resolve it if it hasn't. We can easily tell, since we
initialize each pointer to nilP, which is an illegal address.
This test and the call to FindSymbol to resolve it, is in
get_transition_vector which is called at the beginning of our
external call sequence.
We could save a couple of instructions by pre-resolving symbols that
are already in the dictionary image, but it's not worth it - it's
better to use just one scheme, and we do need to be able to resolve
on demand, so that's the way we do it.
The transition vector has 2 addresses - the addr for us to branch to,
and the new RTOC value. The dest addr has to be loaded into the CTR
or the LR for us to use it as a branch target. We use the CTR - see
below for the reason for this. We want to load the dest addr as early
as possible so that instruction fetching won't stall, so we do this
part of the setup before we equalize the stack - during the equalization
nothing needs the CTR anyway.
We use r12 for the addr of the transition vector itself, as IM says.
This also won't get messed with during equalization.
We set up r12 and the CTR in get_transition_vector, as well as resolving
the symbol as described above. Factoring as much as possible into
get_transition_vector saves code space in the call sequence for
external calls.
So, as well as a bit of housekeeping, the main thing that CALL_EXTERN
does is to compile a call to get_transition_vector. CALL_EXTERN then
passes 1 as an "xt" to CALL_H. 1 can never be a real xt, since they must
be even, so this tells call_h that this is an external call. CALL_H
looks after everything from here on, including the stack equalization.
2. The first thing call_h does is pass 1 to EQUALIZE_FOR_CALL (in the
equalization section). This gets the parameters into the right regs (and
the parameter area, if necessary), as needed for external calls.
IM envisages that setting the SP is already done by the prolog
of the current routine, on behalf of all external calls that this
routine makes. The parameter area is big enough for the call with the
most parameters, and the others leave some unused space below the parm
area (actually higher in memory). The parm area for each call must come
immediately below the linkage area, so the callee can find it.
But in Mops we have a separate data stack pointer, so we simply
set up a linkage area for external calls using the system SP (gpr1) at
startup time, and never change it after that.
3. call_h then calls COMPILE_EXTERN_CALL to compile the actual call. To do
this, we store our own RTOC into the linkage area (actually this is done
once and for all at startup since we have a permanent frame for external
calls), and load RTOC from the transition vector (still pointed to by r12).
We then bctrl (branch and link to count register) to call the external
code. (We could equally well have used the LR - see below.)
Note: the standard sequence for cross-TOC calls in Metrowerks C is as
follows. We do much the same, but in a different order - in particular
we grab the dest addr and get it into the CTR as early as possible, before
we normalize the stack etc., and we move the SP to allocate the parm and
linkage areas on each call.
We could equally well have used the LR instead of the CTR. MW have to use
the CTR since they've done a bl to the out-of-line code, and have to preserve
the LR. But the IBM manual recommends using the CTR for computed branches
like this, to make life easier for debuggers etc, so that's what we'll do.
inline:
bl xxx
lwz r2/TOC, $14(r1/SP)
...
xxx lwz r12, <offs>(r2/TOC) / TOC entry is a pointer to transfer vector
stw r2/TOC, $14(r1/SP) / Save RTOC
lwz r0, (r12) / 1st entry in TV is destination addr
lwz r2/TOC, $4(r12) / 2nd entry is new TOC addr - put in RTOC
mtspr CTR, r0 / dest addr to CTR
bctr / branch there
*)
: CALL_EXTERN { ^extern -- }
?comp
false -> leaf? true -> ctr_clobbered?
^extern lit_addr ¥ generate the addr of the extern for
¥ get_transition_vector
" get_transition_vector" evaluate ¥ (which isn't defined till setup)
¥ at run time, it resolves if nec, gets
¥ TV addr to r12 and dest addr to ctr
^extern c@ -> #extern_parm_cells
^extern 1+ c@ -> #extern_result_cells
^extern 2+ c@ -> #extern_FP_parms
^extern 3 + c@ -> #extern_FP_results
^extern 4+ w@ -> extern_mask
1 call_h ¥ 1 means we're doing an extern call, and
¥ causes compile_extern_call to be called
;
(* (TOC_CALL) is for internal use only - it compiles a call to one
of our predefined TOC symbols. We need these in order to be able
to set everything up at initial entry time.
*)
:f (TOC_CALL) { #parm-cells #result-cells offs -- }
#result-cells -> #extern_result_cells
#parm-cells -> #extern_parm_cells
0 -> #extern_FP_parms
0 -> #extern_FP_results
$ 81820000 offs or code,
¥ lwz r12, <offs>(r2) - get trans vect addr to r12
$ 800C0000 code, ¥ lwz r0, (r12) - get dest addr to r0
$ 7C0903A6 code, ¥ mtctr r0 - and then to ctr
1 call_h
false -> leaf? true -> ctr_clobbered?
;f
: COMPILE_EXTERN_CALL
(* compiles the instructions to do an external call. r12 points to the
transition vector in the called container. Note that we have our frame
for external calls permanently set up, and never change gpr1. Thus we save
our RTOC when we set up the frame in SETUP, and don't need to do it here
on each call.
*)
$ 90410014 code, ¥ stw RTOC,20(sys_SP) - save RTOC in frame
$ 804C0004 code, ¥ lwz RTOC, 4(r12) - new TOC addr to RTOC
$ 4E800421 code, ¥ bctrl - call external code
$ 80410014 code, ¥ lwz RTOC, $14(SP) - on return, restore RTOC
;
PPC?
[IF]
: COMPILE_CALL { ^code ¥ seg# displ -- }
(* Compiles a normal Mops call. Mostly this will be a straight
bl to the target, since the effective 24 bits displacement can
get us anywhere we need to go, within the current segment.
For this we here use pseudo-opcode BA with the offset relative
to code_start in the lo 3 bytes, and resolve at finalization time.
(The use of code_start is just a convenience since it's easier than
working out what's the right segment at finalization time, and
we're always compiling in the main dic anyway).
The reason for doing things this way, is that the definition can get
moved at the beginning of finalization, when we compile the prolog.
So the offset for the branch can't be worked out until after that.
But if the call is out of the segment, we have to get the addr
into the CTR and compile a bctrl. In this case it won't be affected
if the defn moves, so we handle the whole thing here.
*)
CDP addr>S&D drop ¥ leave CDP's seg#
^code addr>S&D -> displ -> seg#
seg# =
IF ¥ local branch - we use $BA code - note we make it relative
¥ to code_start so offset will always fit in 24 bits. We're
¥ always compiling in the main dic so this will always be OK.
$ BA000000
^code code_start - or code, ¥ offset MUST be positive -
¥ if for some reason it's not, that's
¥ a bug, and we'll trap on an illegal
¥ instruction!
ELSE ¥ The call is into a different segment. We'll normally dispatch
¥ via the segment table, and set up r15-16 for the called segment.
¥ There are a couple of special cases that will save a few
¥ instructions.
seg# 8 =
IF ¥ We're calling the main dic, so we know we can use r13.
mainCode_reg
displ code_start + nuc_code_start - half_displ_range -
¥ displ is rel to code_start, but reg points into
¥ the middle of the 64k range starting at
¥ nuc_code_start
(litAddr) ¥ compile code to generate addr, push reg reference
ELSE
seg# segTable_entry 4+ lit_addr
postpone @
displ postpone literal postpone +
THEN
opnd1 pop
gpr: opnd1 21 <<
$ 7C0903A6 or code, ¥ mtctr rN
$ 4E800421 code, ¥ bctrl
true -> ctr_clobbered? ¥ we used the ctr
free: opnd1
THEN
;
[ELSE]
: COMPILE_CALL { ^code -- }
$ BA000000
^code code_start - or code,
;
¥ In the PPC compilation, these next 2 are in pnuc4.
: XT? { xt ¥ code -- xt b } ¥ Checks if xt is really a legal xt.
xt ¥ we'll return this
xt 2- 3 and IF false EXIT THEN ¥ 2 less must be aligned
xt 2- c@ -> code ¥ top byte of handler
code $ BD = code $ BE = or
;
: ?XT ¥ ( xt -- xt )
xt? NIF ." not a valid xt" 1 die THEN
;
[THEN]
:f CALL_H { xt ¥ svLocal? cFloat? cFwd? ^code -- }
local? -> svLocal? false -> local? false -> leaf?
false -> cFloat? false -> cFwd?
xt 1 = ¥ 1 is valid as an "xt" here
NIF xt ?xt drop ¥ otherwise must be a legal xt
xt 2+ -> ^code ¥ normal case
xt 2- w@ $ BE01 = -> cFwd? ¥ $ BE01 handler code means forward
THEN
xt equalize_for_call ¥ also does method handling if this is a method
¥ (see e.g. setup_normal_call in cg3)
xt 1 and
NIF
¥ normal Mops call - we use temp opcode BA and resolve at the end of the defn
¥ First we have to check if it's an FP defn:
xt c@ $ 10 and ¥ floating?
IF true -> cFloat? 4 ++> ^code THEN
^code compile_call
¥ now we set things up as they need to be on return from the word we called
cLeaf? IF true cMeth? compile_epilog THEN
cFwd?
IF true -> ctr_clobbered? ¥ we have to be conservative
THEN
xt c@
dup $ 40 and IF true -> ctr_clobbered? THEN
$ F and ¥ # GPRs with results
cFloat?
IF ¥ called word has FP flag bytes
xt 4+ c@ $ F and ¥ in which case #FPRs with results is given
ELSE
fpr_call_cnt ¥ otherwise we use the default - same as entry
THEN
( #fprs ) reset_fcstk
( #gprs ) reset_cstk
debug? if
." normal call compiled. cstks after:" cr
printall: cstk printall: fcstk
." fpr_rtn_cnt " fpr_rtn_cnt . cr
then
update_refcnts
false -> cLeaf? false -> cMeth? ¥ finished with them now
ELSE ¥ external call
compile_extern_call
#extern_result_cells reset_cstk
#extern_FP_results reset_fcstk
THEN
CDP -> backstop_CDP CDP -> basic_block_start
svLocal? -> local?
;f
: CALLSTR_H call_h ;